AITopics | image generation

fa64505ebdc94531087bc81251ce2376-Paper-Conference.pdf

Neural Information Processing Systems | May-1-2026, 06:35:29 GMT

large language model, machine learning, natural language, (19 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

fa64505ebdc94531087bc81251ce2376-Supplemental-Conference.pdf

Neural Information Processing Systems | May-1-2026, 05:15:24 GMT

In this work, we investigate the task of text-to-image (T2I) synthesis under the abstract-to-intricate setting, i.e., generating intricate visual content from simple abstract text prompts. Inspired by human imagination intuition, we propose a novel scene-graph hallucination (SGH) mechanism for effective abstract-to-intricate T2I synthesis. SGH carries out scene hallucination by expanding the initial scene graph (SG) of the input prompt with more feasible specific scene structures, in which the structured semantic representation of SG ensures high controllability of the intrinsic scene imagination. To approach the T2I synthesis, we deliberately build an SG-based hallucination diffusion system. First, we implement the SGH module based on the discrete diffusion technique, which evolves the SG structure by iteratively adding new scene elements. Then, we utilize another continuous-state diffusion model as the T2I synthesizer, where the overt image-generating process is navigated by the underlying semantic scene structure induced from the SGH module. On the benchmark COCO dataset, our system outperforms the existing best-performing T2I model by a significant margin, especially improving on the abstract-to-intricate T2I generation. Further in-depth analyses reveal how our methods advance.

Improving Diffusion-Based Image Synthesis with Context Prediction

Neural Information Processing Systems | May-1-2026, 03:57:25 GMT

Diffusion models are a new class of generative models and have dramatically advanced image generation, with unprecedented quality and diversity. Existing diffusion models mainly try to reconstruct the input image from a corrupted one with a pixel-wise or feature-wise constraint along spatial axes. However, such point-based reconstruction may fail to make each predicted pixel/feature fully preserve its neighborhood context, impairing diffusion-based image synthesis. As a powerful source of automatic supervisory signal, context has been well studied for learning representations. Inspired by this, we propose, for the first time, CONPREDIFF to improve diffusion-based image synthesis with context prediction.

artificial intelligence, diffusion model, machine learning, (14 more...)

Genre: Overview (0.46)

Industry: Media (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Retrieval-Augmented Diffusion Models

Neural Information Processing Systems | May-1-2026, 02:45:52 GMT

Novel architectures have recently improved generative image synthesis, leading to excellent visual quality in various tasks. Much of this success is due to the scalability of these architectures, and hence to a dramatic increase in model complexity and in the computational resources invested in training these models.

arxiv preprint arxiv, machine learning, natural language, (15 more...)

Country: Europe (0.28)

Genre: Research Report (0.68)

Industry: Transportation > Ground (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

f74054328beeb0c21a9b8e99da557f5a-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 08:25:47 GMT

artificial intelligence, machine learning, natural language, (17 more...)

Country: Europe > United Kingdom > England (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.95)
Questionnaire & Opinion Survey (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

Neural Information Processing Systems | Apr-29-2026, 13:58:00 GMT

The recent popularity of text-to-image diffusion models (DM) can largely be attributed to the intuitive interface they provide to users. The intended generation can be expressed in natural language, with the model producing faithful interpretations of text prompts. However, expressing complex or nuanced ideas in text alone can be difficult.

artificial intelligence, machine learning, natural language, (16 more...)

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)

Shape your Space: A Gaussian Mixture Regularization Approach to Deterministic Autoencoders

Neural Information Processing Systems | Apr-25-2026, 12:56:38 GMT

In this document, we provide additional details and results to the main paper. The document is structured as follows: A.1 Loss Analysis - Analysis of the unimodal and multimodal latent regularization loss across different distributions and an ablation study on the proposed loss function. A.2 Image Generation - In this section, we compare the VQVAE model with our method and provide detailed descriptions of the dataset, network architecture, and implementation details of the image generation experiments in the main paper. A.3 Modelling Discrete Structures - In this section, we describe the experimental and implementation details of the discrete data structure experiments in the main paper. A.5 Additional Qualitative Analysis - More examples of randomly generated samples of MNIST, FASHIONMNIST, SVHN and CELEBA images.

artificial intelligence, dataset, machine learning, (14 more...)

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

1a675d804f50509b8e21d0d3ca709d03-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 11:52:22 GMT

computational linguistic, machine learning, natural language, (17 more...)

Country:

Europe (1.00)
North America > Canada > Quebec (0.28)

Genre: Research Report > New Finding (0.95)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Sensing and Signal Processing > Image Processing (0.71)

15f1dbc086bfd94d8c32557b573cbe18-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 06:01:43 GMT

artificial intelligence, generator, machine learning, (17 more...)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Visual Programming for Text to Image Generation and Evaluation

Neural Information Processing Systems | Apr-25-2026, 02:56:31 GMT

As large language models have demonstrated impressive performance in many domains, recent works have adopted language models (LMs) as controllers of visual modules for vision-and-language tasks. While existing work focuses on equipping LMs with visual understanding, we propose two novel interpretable/explainable visual programming frameworks for text-to-image (T2I) generation and evaluation. First, we introduce VPGEN, an interpretable step-by-step T2I generation framework that decomposes T2I generation into three steps: object/count generation, layout generation, and image generation. We employ an LM to handle the first two steps (object/count generation and layout generation) by finetuning it on text-layout pairs. Our step-by-step T2I generation framework provides stronger spatial control than end-to-end models, the dominant approach for this task.
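
The three-step decomposition described in this abstract can be pictured with a toy sketch. It is not VPGEN itself: the finetuned LM and the image generator are replaced by hand-written stand-ins, and every name here (Box, generate_objects, generate_layout, generate_image, the 64-unit canvas and 16-unit cells) is a hypothetical chosen for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Box:
    """A named bounding box on the canvas (hypothetical layout format)."""
    name: str
    x: int
    y: int
    w: int
    h: int


def generate_objects(prompt: str) -> dict[str, int]:
    """Step 1, object/count generation. Stand-in for the LM: reads counts
    from phrases such as '2 dogs'; words without a leading number are ignored."""
    counts: dict[str, int] = {}
    tokens = prompt.lower().replace(",", " ").split()
    for number, noun in zip(tokens, tokens[1:]):
        if number.isdigit():
            counts[noun] = counts.get(noun, 0) + int(number)
    return counts


def generate_layout(counts: dict[str, int], canvas: int = 64, cell: int = 16) -> list[Box]:
    """Step 2, layout generation. Stand-in for the LM: gives each instance
    its own grid cell, filling left to right and then top to bottom."""
    per_row = canvas // cell
    if sum(counts.values()) > per_row * per_row:
        raise ValueError("too many objects for this toy canvas")
    boxes: list[Box] = []
    for name, n in counts.items():
        for _ in range(n):
            row, col = divmod(len(boxes), per_row)
            boxes.append(Box(name, col * cell, row * cell, cell, cell))
    return boxes


def generate_image(boxes: list[Box], canvas: int = 64, cell: int = 16) -> list[str]:
    """Step 3, image generation. Stand-in for a layout-conditioned image model:
    draws each box as the first letter of its name on a coarse text grid."""
    n = canvas // cell
    grid = [["." for _ in range(n)] for _ in range(n)]
    for b in boxes:
        grid[b.y // cell][b.x // cell] = b.name[0]
    return ["".join(row) for row in grid]


if __name__ == "__main__":
    counts = generate_objects("2 dogs and 1 ball")
    layout = generate_layout(counts)
    print("\n".join(generate_image(layout)))
```

Because the intermediate counts and boxes are plain data, each step can be inspected or edited before the next one runs, which is the kind of spatial control the abstract attributes to a step-by-step pipeline.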

large language model, machine learning, natural language, (19 more...)

Industry: Leisure & Entertainment (0.46)

Technology: